(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 1 016 980 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 05.07.2000 Bulletin 2000/27

(51) Int. Cl.7: G06F 15/173, H04L 12/56



(21) Application number: 99309766.6 

(22) Date of filing: 06.12.1999 



(84) Designated Contracting States:
     AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
     Designated Extension States:
     AL LT LV MK RO SI

(30) Priority: 22.12.1998 US 218954

(71) Applicant: NCR INTERNATIONAL INC.
     Dayton, Ohio 45479 (US)

(72) Inventors:
     • McMillen, Robert James
       Carlsbad, CA 92009 (US)
     • Nguyen, Chinh Kim
       San Diego, CA 92131 (US)

(74) Representative: Cleary, Fidelma et al
     International IP Department
     NCR Limited
     206 Marylebone Road
     London NW1 6LY (GB)



(54) Distributed multi-fabric interconnect 

(57) An interconnect network having a plurality of identical fabrics partitions the switching elements of the fabrics, so that many links can be combined into single cables. In the partitioning method, one or more of the switching elements from the first stage of each of the fabrics are physically packaged onto the same board, called a concentrator, and these concentrators are physically distributed among the processing nodes connected to the interconnect network. The concentrator allows all the links from each processing node to a concentrator, each of which needs to be connected to a different fabric, to be combined into a single cable. Furthermore, the concentrator allows all the links from a single switching element in the first stage to be combined into a single cable to be connected to the subsequent or expansion (second and higher) stages of the fabric. The subsequent or expansion stages of each fabric can be implemented independently of other fabrics in a centralized location.
[0003] [...] Processors communicate with each [...]

[0004] Practical implementations favor modularity [...] performance must scale linearly [...] processing nodes, where each [...]

[0008] Another problem with MPP systems results from the commoditization of processor hardware. Computer system manufacturers no longer design [...] typically comprised of large collections of proc[...]

[...] through fabric replication in a cost-effective manner.

[0013] From a first aspect the present invention resides in an interconnection network comprising a plurality of identical fabrics for interconnecting a plurality of processing nodes for communication therebetween, each of the fabrics comprised of at least one stage, each stage comprised of a plurality of switching elements, one or more of the switching elements from a first stage of each of the fabrics being combined together in at least one concentrator, the concentrator allowing all links from each processor to the fabrics to be combined into a single cable coupled to the concentrator.
[0014] From a further aspect the invention resides in a massively parallel processing system comprising the above interconnection network. The invention also resides in a concentrator for the above interconnection network.
[0015] The present invention provides a method for partitioning the switching elements of multiple fabrics, so that many links can be combined into single cables, thereby enabling higher density packaging and making the implementation of multiple fabrics practical. The partitioning method disclosed is applicable to any multistage interconnect constructed from a x b bidirectional switching elements, where a>1, b>0 or a>0, b>1. According to the present invention, one or more of the switching elements from the first stage of each of several identical fabrics are physically packaged onto the same board, called a concentrator, and these concentrators are physically distributed among the processing nodes.

[0016] This concentrator approach allows all the links from each processing node to a concentrator, each of which needs to be connected to a different fabric, to be combined into a single cable. Furthermore, it allows all the links emanating from a single switching element in the first stage to be combined into a single cable to be connected to the second and subsequent stages of that fabric in larger configurations.

[0017] The subsequent or expansion stages (second and higher) of each fabric can be implemented independently 
of other fabrics in a centralized location. This partitioning of the collection of all the fabrics in the interconnect is what 
leads to all the benefits that have been described. 

[0018] Since it is typically the physical size of the cable connectors that limits the packaging density of interconnects, not the switching electronics, this leads to high density packaging of individual fabrics, allowing cost-effective deployment of multi-fabric interconnects.

[0019] The invention is advantageous in that it leads to reduction of the cable count in MPP systems, and also eases the installation effort. Moreover, implementation of the interconnect is distributed, so that the switching hardware can consume otherwise unused space, power and cooling resources by being co-located with processor hardware.
[0020] An embodiment of the invention will now be described with reference to the accompanying drawings in which 
like reference numbers represent corresponding parts throughout: 

FIG. 1A illustrates a generic bidirectional a x b crossbar switching element and FIGS. 1B, 1C, and 1D illustrate three possible implementations of the element;

FIG. 2 illustrates a multistage fabric constructed from a x b switching elements, wherein a, b, and n are positive integers and a + b ≥ 3;

FIG. 3 illustrates an example of a three stage fabric constructed from 2 x 3 switching elements;

FIG. 4 illustrates a concentrator containing the j0th stage 0 switching element from each of K different fabrics;

FIG. 5 illustrates an application-specific integrated circuit (ASIC) implementing a bidirectional switch node;

FIG. 6 illustrates a two stage interconnect implementing a folded banyan topology, which shows the typical logical interconnect wiring pattern of a 64 port MPP fabric;

FIG. 7 shows the logical connection between the processing nodes and four fabrics; 

FIG. 8 illustrates the partitioning of switches from multiple fabrics to form a concentrator, and also shows the logical 
connections between a processing node and four fabrics; 

FIG. 9 illustrates a four fabric concentrator with 8x8 switching elements, including the arrangement of crossbar 
switches and wiring connection on the concentrator; 

FIG. 10 illustrates the logical connection of an eight node cluster with a single concentrator of four fabrics; and 
FIG. 11 shows the arrangement of crossbar switches and wiring connection for the second stage of a 64x64 port 
fabric wherein the second stage is divided into four printed circuit boards and they communicate with each other 
through a back plane. 

Massively Parallel Processing System 

[0021] Without loss of generality, a typical MPP system can be considered to be comprised of an interconnection 
network, a number of processing nodes, and mass storage attached to the nodes. In an architecture in which storage 
is attached to the interconnect, storage can be considered just another node from the point of view of the interconnect. 
[0022] In highly reliable interconnect implementations, two fabrics are provided for redundancy. If both fabrics are 
active, higher performance also results. 

[0023] The partitioning method taught by this invention is broadly applicable to a very large class of interconnects. 
[0031] Any interconnection network with [...]

[...] value ranges from 0 to $a-1$ and $\beta$ represent a $b$-ary digit whose value ranges from 0 to $b-1$, then $X_i^{\mathrm{right}} = [\alpha_{n-i-2} \ldots \alpha_1 \alpha_0 \beta_i \ldots \beta_1 \beta_0]$. Such a notation is referred to as a mixed radix representation.
[0036] For notational convenience, digits of the same arity are grouped together; however, the only requirement is that the least significant digit of a right side level must be $b$-ary (i.e., a $\beta$). The other digits can appear in any order; however, the same order should be used to identify every level in the same stage. A left side port in the $i$-th stage is represented as $X_i^{\mathrm{left}} = [\beta_{i-1} \ldots \beta_1 \beta_0 \alpha_{n-i-1} \ldots \alpha_1 \alpha_0]$. In this case, the least significant digit must be an $\alpha$.
[0037] The number of right side ports in stage $i$ must be equal to the number of left side ports in stage $i+1$ so that a permutation of the links can be formed. That is equivalent to determining that the maximum value representable by each $X$ is the same. Thus, the relationship $\mathrm{MAX}(X_i^{\mathrm{right}}) = \mathrm{MAX}(X_{i+1}^{\mathrm{left}})$, $0 \le i < n-1$, must be true. The following conversion formula can be used to verify that this is true:

$$X = [\beta_p \ldots \beta_1 \beta_0 \alpha_q \ldots \alpha_1 \alpha_0] = \sum_{j=0}^{p} \beta_j b^j a^{q+1} + \sum_{j=0}^{q} \alpha_j a^j$$


[0038] This is a radix conversion formula in which base $r$ is implicitly used to compute the weighted sum of the mixed radix digits representing $X$. Base $r$ is typically 10, but any base could be used. Just as the maximum value of a four digit base 10 number is represented by setting all the digits to "9," the maximum value of $X_i^{\mathrm{right}}$ and $X_{i+1}^{\mathrm{left}}$ can be evaluated by setting $\beta_j = b-1$, $0 \le j \le i$, and $\alpha_j = a-1$, $0 \le j \le n-i-2$, in each mixed radix representation, respectively. This yields the following relationship to be verified:

$$\sum_{j=0}^{n-i-2} (a-1)\, a^j\, b^{i+1} + \sum_{j=0}^{i} (b-1)\, b^j = \sum_{j=0}^{i} (b-1)\, b^j\, a^{n-i-1} + \sum_{j=0}^{n-i-2} (a-1)\, a^j$$

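[0039] Using the fact that

$$\sum_{k=0}^{p} d^k = \frac{d^{p+1}-1}{d-1},$$

$d$ and $p$ any positive integers, the above relationship is simplified to

$$(a-1)\,\frac{a^{n-i-1}-1}{a-1}\, b^{i+1} + (b-1)\,\frac{b^{i+1}-1}{b-1} = (b-1)\,\frac{b^{i+1}-1}{b-1}\, a^{n-i-1} + (a-1)\,\frac{a^{n-i-1}-1}{a-1}$$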

[0040] It can be readily verified that both sides of the equation reduce to $a^{n-i-1} b^{i+1} - 1$. Since counting starts at 0, this means there are $a^{n-i-1} b^{i+1}$ total links between stages $i$ and $i+1$, as was stated earlier. Furthermore, it can be shown that this is true for any permutation of the mixed radix digits.
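
The equality of the two maxima can also be checked numerically. The following Python sketch is illustrative only and is not part of the disclosure; the function names (`mixed_radix_value`, `max_right_level`, `max_left_level`, `link_count`) are assumptions introduced here. It evaluates a mixed radix number from (digit, radix) pairs and confirms that both digit orders give the same maximum for a fabric of 2 x 3 switching elements.

```python
from typing import Sequence, Tuple


def mixed_radix_value(digits: Sequence[Tuple[int, int]]) -> int:
    """Evaluate a mixed radix number from (digit, radix) pairs, most significant first."""
    value = 0
    for digit, radix in digits:
        if not 0 <= digit < radix:
            raise ValueError(f"digit {digit} out of range for radix {radix}")
        # Horner's rule: equivalent to the weighted sum in the conversion formula.
        value = value * radix + digit
    return value


def max_right_level(a: int, b: int, n: int, i: int) -> int:
    """Largest right side level in stage i: alpha digits first, then beta digits."""
    digits = [(a - 1, a)] * (n - i - 1) + [(b - 1, b)] * (i + 1)
    return mixed_radix_value(digits)


def max_left_level(a: int, b: int, n: int, i: int) -> int:
    """Largest left side level in stage i+1: beta digits first, then alpha digits."""
    digits = [(b - 1, b)] * (i + 1) + [(a - 1, a)] * (n - i - 1)
    return mixed_radix_value(digits)


def link_count(a: int, b: int, n: int, i: int) -> int:
    """Number of links between stages i and i+1."""
    return a ** (n - i - 1) * b ** (i + 1)


if __name__ == "__main__":
    for stage in range(2):
        print(stage, max_right_level(2, 3, 3, stage), link_count(2, 3, 3, stage))
```

For a = 2, b = 3 and n = 3 this gives 12 links between stages 0 and 1 and 18 links between stages 1 and 2.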

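[0041] To reference a specific left side I/O port on a specific switching element using the first method, the notation $(i,\ \beta_{i-1} \ldots \beta_1 \beta_0 \alpha_{n-i-1} \ldots \alpha_2 \alpha_1,\ \alpha_0)_{\mathrm{left}}$ is used, and by the second method, $(i,\ \beta_{i-1} \ldots \beta_1 \beta_0 \alpha_{n-i-1} \ldots \alpha_1 \alpha_0)_{\mathrm{left}}$. Note that the switch element number would be evaluated as

$$\beta_{i-1} \ldots \beta_1 \beta_0 \alpha_{n-i-1} \ldots \alpha_1 = \sum_{j=0}^{i-1} \beta_j b^j a^{n-i-1} + \sum_{j=1}^{n-i-1} \alpha_j a^{j-1}.$$

The formula has been modified to take into account the fact that the $\alpha$ subscripts start at $j=1$, not 0, so the proper power of $a$ is used. In a similar fashion, for a right side port, the first method specifies $(i,\ \alpha_{n-i-2} \ldots \alpha_1 \alpha_0 \beta_i \ldots \beta_2 \beta_1,\ \beta_0)_{\mathrm{right}}$ and the second, $(i,\ \alpha_{n-i-2} \ldots \alpha_1 \alpha_0 \beta_i \ldots \beta_1 \beta_0)_{\mathrm{right}}$.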
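
Because the least significant digit of a left side level is a-ary and that of a right side level is b-ary, a port level can be split into a switching element number and a port number with one integer division. The sketch below is illustrative only; the function names are assumptions introduced here, and it assumes that the least significant digit selects the port on the element, which is how the switch element number formula above reads.

```python
from typing import Tuple


def split_left_port_level(level: int, a: int) -> Tuple[int, int]:
    """Return (switching element number, left port number) for a left side level.

    Assumption: the least significant a-ary digit selects the port, so the
    remaining higher order digits form the element number.
    """
    return divmod(level, a)


def split_right_port_level(level: int, b: int) -> Tuple[int, int]:
    """Return (switching element number, right port number) for a right side level."""
    return divmod(level, b)


if __name__ == "__main__":
    # 2 x 3 switching elements: left level 11 is port 1 of element 5,
    # right level 11 is port 2 of element 3.
    print(split_left_port_level(11, 2), split_right_port_level(11, 3))
```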

[0042] For a given value of $i$, if the subscript of any digit evaluates to a negative number, none of the digits of that radix exist in that number. Also, the subscript expression containing $i$ is the highest order digit of that radix in the number.
[...] not necessarily that shown [...]

[...] address is moved to the least significant digit position, one per stage, so that each $\alpha$ in the original [...] number of stages in the interconnect [...]

[...] it is useful to introduce the concept of a tracer. A tracer can be used to track the movement of digits [...]. A tracer consists of a sequence of $n$ digits which each represent [...] a mixed radix number. A tracer is simply the sequence of superscripts shown in [...] notations, i.e., $[(n-1),(n-2),\ldots,i,\ldots,2,1,0]$.

[0050] For example, consider [...]. Digit number 3 (the fourth most significant digit) is [...]. The input tracer is [43210] (commas are omitted from the tracer when each digit position can be represented by a single decimal digit). The effect of the first permutation on [...] will be used in lieu of superscripts [...] tracer digits will be subscripted with $\alpha$ or $\beta$ to indicate [...], e.g., $[4_\alpha 3_\alpha 2_\beta 1_\beta 0_\beta]$ [...], respectively.

[...] input tracer. Hence, a tracer that started in [...] at the left side ports in stage $i+1$ than one that originated at the right side ports in stage [...]
Three Fabric Types

[0054] The relationship between a and b can take three forms, each of which defines a different class of interconnect. If a<b, a trapezoidal shaped fabric is formed in which there are $b^n$ paths between every pair of $a^n$ fabric left side ports. When implemented as a FORM I fabric, there are more paths internal to the fabric than external. Assuming a message routing scheme that exploits this property, this class of fabrics would have less internal contention among messages, which would produce lower latency and higher throughput. A FORM II version of this class would be suitable for an architecture in which storage is attached to the interconnect. In cases in which the ratio of storage nodes to processing nodes is greater than one, processor nodes would be attached to the left side and storage nodes to the right. If the converse were true, the attachment sides would be reversed.

[0055] If a>b, a fabric is formed that some in the literature have referred to as a "fat tree." If b=1, an a-ary tree results. If a=2, a classic binary tree is obtained. This class of fabrics is typically implemented as FORM I. The NCR Y-Net is an example of a FORM I binary tree.

[0056] The third and most common class is that in which a=b. In this case, the switching elements are "square" 
having equal numbers of ports on each side and thus, produce square fabrics. This class is a special case, because 
all digits used in numbers representing levels have the same arity or radix. This leads to simplification of the notation 
needed to describe the characteristics of this class of fabrics. 
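
The three classes can be summarised in a few lines of Python. This is a minimal sketch for illustration; the function names and the returned labels are assumptions introduced here, not terms defined by the disclosure beyond the words quoted from it.

```python
from typing import Tuple


def classify_fabric(a: int, b: int) -> str:
    """Name the fabric class produced by a x b switching elements."""
    if a < 1 or b < 1 or a + b < 3:
        raise ValueError("need a > 1, b > 0 or a > 0, b > 1")
    if a < b:
        return "trapezoidal"
    if a > b:
        # b = 1 is the special case of an a-ary tree (binary when a = 2).
        return "a-ary tree" if b == 1 else "fat tree"
    return "square"


def port_counts(a: int, b: int, n: int) -> Tuple[int, int]:
    """Left and right side port counts of an n stage fabric."""
    return a ** n, b ** n


if __name__ == "__main__":
    for a, b in [(2, 3), (4, 2), (2, 1), (8, 8)]:
        print(a, b, classify_fabric(a, b), port_counts(a, b, 3))
```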

Examples 

[0057] For a fabric in which n=1, only one a x b switching element is required, so no permutation functions are necessary.

[0058] If n=2, there are two stages and the fabric is $a^2 \times b^2$. There is only one permutation function possible between Stage 0 and Stage 1: $\mathrm{PERMUTE}^2_0\{\alpha_0\beta_0\} = \beta_0\alpha_0$. The corresponding output tracer is [01].
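
As an illustration of what this single permutation does to the wiring, the following Python sketch enumerates it for a small case. It is a sketch under stated assumptions: the function name `permute_two_stage` is introduced here, and levels are evaluated with the weighted sum of the mixed radix digits, so a right side level is $\alpha_0 b + \beta_0$ and the permuted left side level is $\beta_0 a + \alpha_0$.

```python
from typing import List


def permute_two_stage(a: int, b: int) -> List[int]:
    """Wiring between stage 0 and stage 1 of a two stage fabric.

    Index: right side port level [alpha_0 beta_0] in stage 0.
    Value: left side port level [beta_0 alpha_0] in stage 1.
    """
    wiring = []
    for level in range(a * b):
        alpha0, beta0 = divmod(level, b)   # decode [alpha_0 beta_0]
        wiring.append(beta0 * a + alpha0)  # encode [beta_0 alpha_0]
    return wiring


if __name__ == "__main__":
    print(permute_two_stage(2, 3))
```

With 2 x 3 switching elements the six links are wired 0->0, 1->2, 2->4, 3->1, 4->3 and 5->5, so every right side port of a stage 0 element reaches a different stage 1 element.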

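[0059] If n=3, there are three stages and the fabric is $a^3 \times b^3$. Two permutation functions are needed: $\mathrm{PERMUTE}^3_0\{\alpha_1\alpha_0\beta_0\}$ and $\mathrm{PERMUTE}^3_1\{X\}$, where $X$ is either in the form $\alpha_0\beta_1\beta_0$ or $\beta_1\alpha_0\beta_0$. Of the six possible digit permutations, there are four legal/useful possibilities for $\mathrm{PERMUTE}^3_0\{\alpha_1\alpha_0\beta_0\}$ (the input tracer is $[2_\alpha 1_\alpha 0_\beta]$): (I) $\alpha_1\beta_0\alpha_0$ ($[2_\alpha 0_\beta 1_\alpha]$); (II) $\alpha_0\beta_0\alpha_1$ ($[1_\alpha 0_\beta 2_\alpha]$); (III) $\beta_0\alpha_1\alpha_0$ ($[0_\beta 2_\alpha 1_\alpha]$); and (IV) $\beta_0\alpha_0\alpha_1$ ($[0_\beta 1_\alpha 2_\alpha]$). (All preceding tracers are single stage.) Notice that (I) and (II) are both of the form $\alpha\beta\alpha$. After passing through the switching element, they will both be of the form $\alpha\beta\beta$. Similarly, (III) and (IV) are of the form $\beta\alpha\alpha$ and will be converted by the switching element they enter to the form $\beta\alpha\beta$. The other two possible digit permutations are $\alpha_1\alpha_0\beta_0$ ($[2_\alpha 1_\alpha 0_\beta]$) and $\alpha_0\alpha_1\beta_0$ ($[1_\alpha 2_\alpha 0_\beta]$).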

[0060] If a ≠ b, these are both illegal because the least significant digit is a $\beta$. In this context, "illegal" means that even though the permutation produced is valid, the interconnect that results will not function correctly. There will be a mismatch between each set of b links these permutations group together for switching and the a ports available at the switch.

[0061] If a=b, the first of these is just the identity permutation, which accomplishes nothing. The second is also not useful because the switching element from which this emanated just transformed that digit, so it doesn't need to be processed again (unless it is desired to introduce redundant paths, but that option is outside the scope of this discussion).
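
The split of the six digit permutations into four legal/useful ones and two rejected ones can be reproduced mechanically. The Python sketch below is illustrative only; the names `INPUT_LEVEL` and `classify_permutations` are assumptions introduced here. It applies the single rule stated in the text, namely that the least significant digit arriving at the left side of the next stage must be an $\alpha$.

```python
from itertools import permutations
from typing import List, Tuple

Tracer = Tuple[int, ...]

# Input level [alpha_1 alpha_0 beta_0] as (radix symbol, tracer position) pairs.
INPUT_LEVEL = (("alpha", 2), ("alpha", 1), ("beta", 0))


def classify_permutations() -> Tuple[List[Tracer], List[Tracer]]:
    """Return (legal tracers, rejected tracers) for the first permutation when n = 3."""
    legal: List[Tracer] = []
    rejected: List[Tracer] = []
    for perm in permutations(INPUT_LEVEL):
        tracer = tuple(position for _, position in perm)
        # Legal only if an alpha digit lands in the least significant position.
        if perm[-1][0] == "alpha":
            legal.append(tracer)
        else:
            rejected.append(tracer)
    return legal, rejected


if __name__ == "__main__":
    legal, rejected = classify_permutations()
    print("legal:", legal)
    print("rejected:", rejected)
```

The four legal tracers are 201, 102, 021 and 012, matching possibilities (I) to (IV); the rejected tracers 210 and 120 are the identity and the permutation that leaves $\beta_0$ in place.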

[0062] Of the legal permutations, the first is preferred because $\alpha_1$ does not change position. That implies the worst case physical "distance" the links must span is minimal.

[0063] There are only two legal possibilities for $\mathrm{PERMUTE}^3_1\{X\}$, but which two depends on what was selected for $\mathrm{PERMUTE}^3_0\{X\}$. If either (I) or (II) was selected, so the mixed radix representation of the right side port level in stage 0 is of the form $\alpha\beta\beta$, then $\mathrm{PERMUTE}^3_1\{\alpha_0\beta_1\beta_0\}$ is either $\beta_1\beta_0\alpha_0$ ($[1_\beta 0_\beta 2_\alpha]$) or $\beta_0\beta_1\alpha_0$ ($[0_\beta 1_\beta 2_\alpha]$), neither of which has any particular advantage over the other. If either (III) or (IV) was selected, so the mixed radix representation of the right side port level in stage 0 is of the form $\beta\alpha\beta$, then $\mathrm{PERMUTE}^3_1\{\beta_1\alpha_0\beta_0\}$ is either $\beta_1\beta_0\alpha_0$ ($[2_\beta 0_\beta 1_\alpha]$) or $\beta_0\beta_1\alpha_0$ ($[0_\beta 2_\beta 1_\alpha]$).
[0064] The form of the mixed radix representation for the right side level number, i.e. the order in which the higher order (>0) digits appear at the right side ports, has a definite bearing on the topology generated in this stage. This is made clear by the tracers which track the movement of the digits. For example, even though $\beta_1\beta_0\alpha_0$ is a desired form of left side address (of switching elements in stage $i+1$) for all four possible $\mathrm{PERMUTE}^3_1\{X\}$ permutations, if the form of right side address (of switching elements in stage $i$) is $\alpha_0\beta_1\beta_0$, tracer $[1_\beta 0_\beta 2_\alpha]$ results. Whereas, if the right side address has form $\beta_1\alpha_0\beta_0$, tracer $[2_\beta 0_\beta 1_\alpha]$ results. The tracers show that the same $\beta\beta\alpha$ form is achieved, but the digits originate from different positions, so different permutations are required.

[0065] These are distinct permutations, but it can be shown that they are topologically isomorphic. As stage numbers increase, there are fewer permutations to choose among because there are fewer unprocessed $\alpha$'s to move into the least significant digit position.

[0066] Suppose $\mathrm{PERMUTE}^3_0\{\alpha_1\alpha_0\beta_0\} = \beta_0\alpha_0\alpha_1$ and $\mathrm{PERMUTE}^3_1\{\beta_1\alpha_0\beta_0\} = \beta_1\beta_0\alpha_0$ are chosen as the two permutations to be implemented. The action of the switching elements (x) and permutations (->) can be observed by following a tracer from left side entry to right side exit as follows: [...]

[...] is the level of a right side port in stage 0 and [...] the level of a left side port in stage 1, then they each have $(2)^2 \cdot 3$ or 12 possible values that range from 0 to 11 [...] represented in mixed radix notation by [...]

[...] The process for stage 1 is similar. In this case [...] 18 values that range from 0 to 17 [...] permutation is enumerated in the Table [...]

[0069] The resulting fabric is illustrated in FIG. 3, which shows an example of a three stage fabric 300 constructed from 2 x 3 switching elements. In FIG. 3, [...] it can be verified that the wiring patterns in stages 0 and 1 match the prescribed numberings from Table 1.

Partitioning [...]

[0070] In an interconnect with K fabrics, [...] interconnect, with one link per fabric. [...] in a centralized fashion [...] provides an opportunity to consolidate the links [...] links associated with a node [...] large numbers of ports, even for nominal [...] (e.g., as few as 4). The solution is to [...] of different fabrics. Each C-trunk contains [...] cables. The crux of this is that all of [...]

Implementation of the [...]
[...] (labeled as IPLx), Output Port Logic (labeled as OPLx), Diagnostic Port Logic (labeled as DPL), BYNET™ Output Ports and BYNET™ Input Ports.

[0075] Selectable loop-back connections are provided internally, as illustrated in FIG. 5. In a preferred embodiment, some of the links that traverse short distances are parallel byte wide paths, while those that traverse longer distances are serial and use high speed Fibre Channel physical layer components and protocol.

[0076] These crossbar switches are cascaded in multiple stages to achieve the expanded connectivity for any number of processing nodes required by the system size. For example, one fabric in a system of 64 nodes would require 2 stages of 8 crossbar ASICs (16 crossbar ASICs) and one fabric in a system of 512 nodes would require 3 stages of 64 crossbar ASICs each (192 crossbar ASICs).
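
The two ASIC counts quoted above follow from simple arithmetic, sketched here in Python. This is an illustration, not the disclosed design procedure; the function name `crossbar_requirements` is an assumption, and the sketch assumes square 8x8 crossbars with every stage holding one crossbar per 8 nodes.

```python
from typing import Tuple


def crossbar_requirements(nodes: int, radix: int = 8) -> Tuple[int, int, int]:
    """Return (stages, crossbars per stage, total crossbars) for one fabric.

    Assumes square radix x radix crossbars and a node count that is a
    multiple of the radix.
    """
    if nodes < radix or nodes % radix:
        raise ValueError("node count must be a positive multiple of the radix")
    stages, reach = 0, 1
    while reach < nodes:          # each added stage multiplies reach by radix
        reach *= radix
        stages += 1
    per_stage = nodes // radix
    return stages, per_stage, stages * per_stage


if __name__ == "__main__":
    print(crossbar_requirements(64))   # 64 node fabric
    print(crossbar_requirements(512))  # 512 node fabric
```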

[0077] The crossbar switches are connected with a topology that makes communication between any two end points possible, according to the methods described earlier. Current packaging technology requires the interconnect to be partitioned among multiple printed circuit boards, back planes and cabinets.

[0078] FIG. 6 illustrates a two stage interconnect 600 implementing a folded banyan topology, which shows the 
typical logical interconnect 600 wiring pattern of a 64 port MPP fabric. 

[0079] FIG. 7 shows the logical connection between a processing node 700 and four fabrics 702. 
[0080] For large configurations, cable management is a significant issue. Consider a 256 processing node system 
and a centralized interconnect with eight fabrics. There are 2048 cables, each typically 30 meters long. Depending on 
the density of the fabric implementation, 256 cables have to egress from the one to four cabinets per fabric. In this 
case, the density of each fabric is usually limited by the size of the connector used by each cable, not by the electronics. 
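
The cable figures in this example can be reproduced with a short calculation. The Python sketch below is illustrative only; the function name `centralized_cable_counts` and the use of the 30 meter typical length to estimate total cable run are assumptions made here.

```python
from typing import Tuple


def centralized_cable_counts(nodes: int, fabrics: int,
                             cable_length_m: int = 30) -> Tuple[int, int, int]:
    """Return (total cables, cables per fabric, total cable length in meters).

    Assumes a centralized interconnect with one individual cable per node
    per fabric, each of the same typical length.
    """
    total = nodes * fabrics
    return total, nodes, total * cable_length_m


if __name__ == "__main__":
    print(centralized_cable_counts(256, 8))
```

For 256 nodes and eight fabrics this gives 2048 cables in total, 256 per fabric, and roughly 61 km of cable at the typical length.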
[0081] Any attempt at cable reduction by placing multiple links into a single multiconductor cable would require all 
fabrics to be physically interleaved. This is because the links associated with one processing node which are physically 
co-located, all go to different fabrics. 

[0082] Given that each fabric must scale incrementally to very large sizes, it becomes impractical to meet that requirement for multiple fabrics that must be physically interleaved. The concentrator solves this problem by transforming the grouping of links from multiple fabrics per cable to multiple links from the same fabric per cable. This then allows the portion of each fabric beyond the first stage to be packaged independently of the others. The interconnect in a large system resides in multiple cabinets connected together with cables.
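
The regrouping performed by the concentrator can be pictured as a change in what each cable contains. The following Python sketch is a simplified model for illustration; the names `node_cables` and `trunk_cables` and the constant `NODES_PER_CONCENTRATOR` are assumptions introduced here, and the model assumes 8x8 first stage switches with one switch per fabric on each concentrator.

```python
from typing import Dict, List, Tuple

NODES_PER_CONCENTRATOR = 8  # assumption: 8x8 first stage switching elements


def node_cables(nodes: int, fabrics: int) -> Dict[int, List[int]]:
    """Cable leaving each node: one link per fabric (values are fabric numbers)."""
    return {node: list(range(fabrics)) for node in range(nodes)}


def trunk_cables(nodes: int, fabrics: int) -> Dict[Tuple[int, int], List[int]]:
    """Cable leaving each first stage switch: every link belongs to one fabric."""
    concentrators = nodes // NODES_PER_CONCENTRATOR
    return {
        (concentrator, fabric): [fabric] * NODES_PER_CONCENTRATOR
        for concentrator in range(concentrators)
        for fabric in range(fabrics)
    }


if __name__ == "__main__":
    before = node_cables(64, 4)
    after = trunk_cables(64, 4)
    print(len(before), "node cables, each touching", len(set(before[0])), "fabrics")
    print(len(after), "trunk cables, each touching", len(set(after[(0, 0)])), "fabric")
```

On the node side every cable mixes links of all fabrics; on the far side of the concentrator every cable carries links of a single fabric, which is what lets the later stages of each fabric be packaged separately.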

[0083] In the design described in the related applications, a 512 node system required 8 cabinets for one fabric. As the number of fabrics increases, the physical dimension of the interconnect networks expands significantly. The expanded dimension may make the distance between the processing node and the interconnect stretch beyond the limits permitted by the technology. The number of cables between the interconnect and the processing nodes also increases as a multiple of the number of fabrics.

[0084] The present invention reduces the number of cabinets and the cable counts by distributing the first stage of the interconnect networks. The 8x8 crossbar switches of the first stage of each fabric can be located on a new board type called a concentrator. Because the concentrator is small, it can occupy a chassis in the processor cabinet for an 8 node system or in a separate cabinet of multiple concentrators for the larger system.

[0085] FIG. 8 illustrates the partitioning of switches from multiple fabrics 800 to form a concentrator 802, and also shows the logical connections between a processing node 804 and the four fabrics 800. The dotted box representing the concentrator 802 separates the switch nodes labeled BISN0 in each fabric 800 and places them on one concentrator 802 board. The cables (labeled as A, B, C, D) from the processing node 804 to the concentrator 802 can now be bundled together to reduce the number of individual cables. This is possible because all cables come from the same physical source (the processing node 804) and terminate at the same physical destination (the concentrator 802). The 8 outputs from switch node BISN0 of each fabric 800 can also be bundled into one cable to go to the next stage. This distribution of the first stage replaces 4 long cables between the processing node 804 and the first stages of the four fabrics 800 with one cable. It also replaces the 8 cables between the first stage and the second stage with a single cable.
[0086] FIG. 9 illustrates a four fabric concentrator 900 with 8x8 switching elements 902, including the arrangement of crossbar switches and wiring connection on the concentrator 900. The four individual cables connecting the processing node 904 and the first stage switching elements 902 of the four fabrics (not shown) are now bundled into one cable 906, resulting in a 4-to-1 reduction in cables. On the concentrator 900, the bundles are redistributed and routed to the four crossbar switches 902 comprising the first stages of the four fabrics. The outputs of each switch node 902 are bundled together at 908 to connect to the second stage, resulting in an 8-to-1 reduction in cables.
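
The 4-to-1 and 8-to-1 figures can be checked with the arithmetic below. This Python sketch is illustrative only; the function name `cable_reduction` and the dictionary keys are assumptions introduced here, and it assumes one concentrator per group of 8 nodes with one 8x8 first stage switch per fabric on each concentrator.

```python
from typing import Dict


def cable_reduction(nodes: int, fabrics: int, radix: int = 8) -> Dict[str, int]:
    """Count individual links and bundled cables on both sides of the concentrators."""
    if nodes % radix:
        raise ValueError("node count must be a multiple of the switch radix")
    concentrators = nodes // radix
    first_stage_switches = concentrators * fabrics
    return {
        "node_links": nodes * fabrics,               # one link per node per fabric
        "node_cables": nodes,                        # one bundled cable per node
        "stage_links": first_stage_switches * radix, # outputs of first stage switches
        "stage_cables": first_stage_switches,        # one bundled cable per switch
    }


if __name__ == "__main__":
    counts = cable_reduction(nodes=8, fabrics=4)
    print(counts)
    print(counts["node_links"] // counts["node_cables"], "to 1 on the node side")
    print(counts["stage_links"] // counts["stage_cables"], "to 1 toward the second stage")
```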
[0087] FIG. 10 illustrates the logical connection of an eight node 1000 cluster communicating with a single concentrator 1002 for four fabrics (not shown). Each of the nodes 1000 uses a different adapter 1004 to communicate with a different one of the fabrics.

[0088] FIG. 11 shows the arrangement of crossbar switches 1100 and wiring connection for the second stage of a 64x64 port fabric. The second stage is comprised of 8 different switching elements 1100 that communicate with 8 different concentrators (not shown) via 8 bidirectional links per connector 1102. The switching elements 1100 are paired together into four printed circuit boards 1104 that communicate with each other through a back plane 1106.
Claims

[...]

3. The interconnection network of claim 2, wherein a>1 and b>0.

4. The interconnection network of claim 2, wherein a>0 and b>1.

5. The interconnection network of claim [...]

6. The interconnection network of claim 5 wh[...]
[...]

18. The interconnection network of claim 1, wherein the subsequent stages of each fabric are implemented independently of other fabrics in a centralized location.

19. The interconnection network of claim 1, wherein the concentrators are physically distributed among the processing nodes.

20. A massively parallel processing (MPP) system comprising an interconnection network according to any preceding 
claim. 



21. A concentrator for an interconnection network according to any preceding claim. 